Abstract

In the problem of influence maximization in information networks, the objective is to choose a set of 
initially active nodes subject to some budget constraints such that the expected number of active nodes over 
time is maximized. 

The linear threshold model has been introduced to study opinion cascading behavior, for instance, the
spread of products and innovations. Existing studies of the linear threshold model mainly focus on the
progressive case, in which once a user becomes active, it is forced to stay active forever. In this
paper, we consider the non-progressive case, in which active nodes might become inactive in subsequent time
steps. This setting makes it possible to model users' dynamic behavior, and consequently fits better the
situation of continuous consumption in daily life. Previous works on the non-progressive case assumed
that the thresholds indicating the susceptibilities of the individuals change randomly and independently at
every time step. We argue that an individual's susceptibility should be consistent, and hence it is more
realistic to consider the case in which an individual's susceptibility is chosen initially at random, but then
remains the same throughout the process. This setting makes the analysis more complex, since for any
two nodes that share a common ancestor, their statuses are no longer independent.

In this paper, we extend the classic linear threshold model [18] to capture the non-progressive
behavior. The influence maximization problem under our model is proved to be NP-hard, even for the case
when the underlying network has no directed cycles. The first result of this paper is negative: in general,
the objective function of the extended linear threshold model is no longer submodular, and hence the hill
climbing approach that is commonly used in the existing studies is not applicable. Next, as the main result of
this paper, we prove that if the underlying information network is directed acyclic, the objective function is
submodular (and monotone). Therefore, in directed acyclic networks with a specified budget we can achieve a
1/2-approximation on maximizing the number of active nodes over a certain period of time by a deterministic
algorithm, and a (1 - 1/e)-approximation by a randomized algorithm.


1 Introduction 

We consider the problem of an advertiser promoting a product in a social network. The idea of viral marketing 0 
HZ] is that with a limited budget, the advertiser can persuade only a subset of individuals to use the new product, 
perhaps by giving out a limited number of free samples. Then the popularity of the product is spread by word- 
of-mouth, i.e. through the existing connections between users in the underlying social network. 

Information networks have been used to model such cascading behavior [231 HL S H2J D IS HS|- An 
information network is a directed edge-weighted graph, in which a node represents a user, whose behavior is 
influenced by its (outgoing) neighbors, and the weight of an edge reflects how influential the corresponding 
neighbor is. A node adopting the new behavior is active and is otherwise inactive. The threshold model HSUH] 
is one way to model the spread of the new behavior. The resistance of a node v to adopting the new behavior is
represented by a random threshold $\theta_v$ (a higher value means higher resistance), where the randomness is used
to model the different susceptibility of different users. The new behavior spreads in the information network
in discrete time steps. An inactive node changes its state to active if the weighted influence from its active
neighbors in the previous time step reaches its threshold. We consider the non-progressive case, where an active
node can revert to the inactive state if the influence from its neighbors drops below its threshold.

Our Contribution. We consider the non-progressive linear threshold model in this paper, which is the
natural extension of the well-known linear threshold model [18]. In Section 2, the formal definition of the
non-progressive linear threshold model is introduced, as well as the influence maximization problem. In most
existing works, for a set of initially active nodes, the influence is measured by the maximum number of active
nodes. Since the existing works consider the progressive case, the number of active nodes increases step
by step and reaches its maximum after at most n time steps, where n is the number of nodes. However,
in the non-progressive case, it is possible that the set of active nodes never becomes stable. Hence, we use the
average number of active nodes over a time period to measure the influence. Similar to the progressive case,
the influence maximization problem under the non-progressive linear threshold model is also NP-hard
(Section 3). In order to approximate the optimum within a constant factor, a commonly used approach is to
prove that the objective function is monotone and submodular. Constant-factor approximation
algorithms then follow from the results of Fisher et al. [15] and Calinescu et al. [3]. However, this
approach is not generally applicable to the non-progressive linear threshold model since, as shown in Section 4,
the average number of active nodes is not necessarily submodular. As the main result of this paper, we study
the case when the information network is acyclic. Consistent with intuition, the expected influence in
acyclic networks is submodular, and hence Fisher's technique (and Calinescu's technique) can be applied to
achieve a constant-factor approximation. It should be noted that although the acyclic case looks much simpler than
the general one (where directed cycles may exist), maximizing the expected influence is still not
easy. As proved in Section 3, the influence maximization problem remains NP-hard even
on acyclic information networks. Furthermore, to prove the submodularity of the expected influence, we still
need a careful technique to handle the complicated association between the statuses of the
nodes. To see the association, consider a time t > 0: the statuses of nodes at time t are associated if they share
some common ancestors, since the threshold of such an ancestor affects the statuses of its descendants. Because of
this association, the statuses of the nodes must be considered more carefully, and the analysis consequently
becomes more complicated. In Section 5, we introduce an equivalent process (called Path Effect) for the
non-progressive linear threshold model, and then prove a deep connection between this process and the random walk
via a coupling technique, which leads to our final conclusion that the
expected influence (under the non-progressive linear threshold model) is submodular.

Related Works. The cascading behavior in information networks was first studied in the computer science 
community by Kempe, Kleinberg and Tardos [18]. They considered the Independent Cascade Model and the 
Linear Threshold Model, the latter of which we generalize in this paper. Their main focus was the progressive 
case, and only reduced the non-progressive case to the progressive one by assigning a new independent random 
threshold to each node at every time step such that the resulting objective function is still submodular. 

Kempe et al. [18] have also shown that the influence maximization problem in such models is NP-hard.
Researchers often first show that the objective functions in question are submodular and then apply
submodular function maximization methods to obtain a constant approximation ratio. An example of such
methods is the Standard Greedy Algorithm, which is analyzed by Nemhauser and Fisher et al. [15].
Loosely speaking, the Standard Greedy Algorithm (also known as the Hill Climbing Algorithm) starts with an
empty solution, and in each iteration, while there is still enough budget, it expands the current solution by
including an additional node that causes the greatest increase in the objective function. Although the costs
for transient and permanent nodes are different in our model, the budget constraint can still be described by a
matroid. Under the matroid constraint, Fisher et al. [15] showed that the Standard Greedy Algorithm achieves a
1/2-approximation, and Calinescu et al. [3] introduced a randomized algorithm that achieves a (1 - 1/e)-approximation
in expectation.
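
The hill-climbing loop described above can be sketched as follows. This is only a minimal illustration under simplifying assumptions: every element costs one unit (a plain cardinality budget rather than the transient/permanent matroid constraint), and the objective is supplied by the caller. The names `greedy_select`, `COVER` and `coverage` are hypothetical and do not come from the paper.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Set


def greedy_select(
    candidates: Iterable[Hashable],
    objective: Callable[[FrozenSet[Hashable]], float],
    budget: int,
) -> Set[Hashable]:
    """Standard greedy (hill climbing) under a cardinality budget.

    Assumption: each element costs one unit, so `budget` is the maximum
    number of elements that may be selected.
    """
    remaining = set(candidates)
    chosen: Set[Hashable] = set()
    current = objective(frozenset(chosen))
    while remaining and len(chosen) < budget:
        best, best_gain = None, 0.0
        # Sorting by repr only makes tie-breaking deterministic.
        for x in sorted(remaining, key=repr):
            gain = objective(frozenset(chosen | {x})) - current
            if best is None or gain > best_gain:
                best, best_gain = x, gain
        if best_gain <= 0:
            break  # no remaining element improves the objective
        chosen.add(best)
        remaining.discard(best)
        current += best_gain
    return chosen


# Toy monotone submodular objective: number of items covered.
COVER = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(selection: FrozenSet[Hashable]) -> float:
    covered = set()
    for name in selection:
        covered |= COVER[name]
    return float(len(covered))


picked = greedy_select(COVER, coverage, budget=2)
```

In an influence maximization setting, `objective` would be replaced by an estimate of the expected influence, which is typically obtained by sampling.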

In the above submodular function maximization algorithms, the objective function needs to be accessed in 
each iteration. However, to calculate the exact value of the objective function is in general hard [7]. One way to 
resolve this is to estimate the value of the objective function by sampling. Some works have used other ways to 
overcome this issue. To improve the efficiency of the Standard Greedy Algorithm, Leskovec et al. [20] showed 
a Cost-Effective Lazy Forward scheme, which makes use of the submodularity of the objective function and 
avoids the evaluation of influence on those nodes for which the incremental influence in the previous iteration 
is less than that of some already evaluated node in the current iteration. This scheme has been shown by
experiments to be more efficient than the Standard Greedy Algorithm. Chen et al. [6] also designed an improved scheme
to speed up the Standard Greedy Algorithm by using some efficiently computable heuristics that have similar
performance.

Chen et al. [5] considered how positive and negative opinions spread in the same network, which can be 
interpreted as the influence process involving two agents. The influence maximization problem considering 
multiple competing agents in an information network has also been studied in (T6l ITOl [21 0]. We follow a 
similar setting in which a newcomer can observe the strategies of existing agents, and strategizes accordingly to
maximize his influence in the network.

Mossel and Roch [21] have shown that under more general submodular threshold functions (as opposed 
to linear threshold functions), the objective function is still submodular and hence the same maximization 
framework can still be applied. 

Similar to our approach, the relationship between influence spreading and random walks has been investigated
by Asavathiratham et al. [1], and by Even-Dar and Shapira in other information network models.

2 Preliminaries 

Definition 1 (Information Network). An information network is a directed weighted graph G = (V, E, b)
with node set V and edge set E, where each edge $(v,u) \in E$ has some positive weight $0 < b_{vu} \le 1$, which
intuitively represents the influencing power of u on v. Denote the set of outgoing neighbors of a node v by
$\Gamma(v) := \{u \in V \mid (v,u) \in E\}$. In addition, for each $v \in V$, the total weight of its outgoing edges is at most 1,
i.e., $\sum_{u \in \Gamma(v)} b_{vu} \le 1$.

Without loss of generality, we assume that in the considered information network G, $\sum_{u \in \Gamma(v)} b_{vu}$ is exactly
1 for every node $v \in V$. To achieve this requirement, for any given G, we add a void node d and, for each node
$v \neq d$, we include (v, d) in the set of edges and set $b_{vd} := 1 - \sum_{u \in \Gamma(v) \setminus \{d\}} b_{vu}$. We also add a self-loop with
weight 1 at the void node d. Furthermore, d is never allowed to be active initially, and hence will never be
active. Unless explicitly specified, when we use the term node in general, we mean a node other than the void
node.

Next, we formally describe the extension of the classic linear threshold model that accommodates
non-progressive behavior. A new feature of our model is that an initially active node can be either transient or
permanent.

Model 1 (Non-progressive Linear Threshold Model (NLT)). Consider an information network G = (V, E, b).
Each node $v \in V$ is associated with a threshold $\theta_v$, which is chosen from (0,1) independently and uniformly at
random. At every time $t \ge 0$, every node v is either active ($\mathcal{A}$) or inactive ($\mathcal{N}$). Denote the set of active nodes at time
t by $A_t$. In the influence process, given a transient initial set $A \subseteq V$, a permanent initial set $\widetilde{A} \subseteq V$, and a
configuration of thresholds $\theta = \{\theta_v\}_{v \in V}$, the nodes update their status according to the following rules.

1. At time t = 0, $A_0 := A \cup \widetilde{A}$.

2. At time t > 0, for each node $v \in V \setminus \widetilde{A}$, compute the activation function $f_v(A_{t-1}) := \sum_{u \in A_{t-1} \cap \Gamma(v)} b_{vu}$.
Then let $A_t := \{v \in V \setminus \widetilde{A} \mid f_v(A_{t-1}) \ge \theta_v\} \cup \widetilde{A}$.
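
To make the two update rules concrete, the following is a minimal sketch of the process for one fixed configuration of thresholds. It assumes a dictionary-of-dictionaries encoding of the network and leaves the void node implicit (any weight missing from a row is treated as going to the void node, which is never active). The function names and the three-node example are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict, Hashable, List, Set

# b[v][u] is the weight of edge (v, u); every node must appear as a key.
Network = Dict[Hashable, Dict[Hashable, float]]


def simulate_nlt(b: Network, theta, transient, permanent, horizon: int) -> List[Set]:
    """Return the active sets [A_0, A_1, ..., A_T] for fixed thresholds."""
    permanent = set(permanent)
    active = set(transient) | permanent          # rule 1: A_0
    history = [set(active)]
    for _ in range(horizon):
        nxt = set(permanent)                     # permanent nodes stay active
        for v in b:
            if v in permanent:
                continue
            # Activation function: total weight towards active out-neighbors.
            f_v = sum(w for u, w in b[v].items() if u in active)
            if f_v >= theta[v]:
                nxt.add(v)
        active = nxt                             # rule 2: A_t
        history.append(set(active))
    return history


def average_influence(history: List[Set]) -> float:
    """Average number of active nodes over time steps 1..T."""
    steps = history[1:]
    return sum(len(a) for a in steps) / len(steps)


# Hypothetical example: v listens to u (0.6) and w (0.4); u listens to w.
B = {"v": {"u": 0.6, "w": 0.4}, "u": {"w": 1.0}, "w": {}}
THETA = {"v": 0.5, "u": 0.7, "w": 0.3}

transient_run = simulate_nlt(B, THETA, {"w"}, set(), 3)
permanent_run = simulate_nlt(B, THETA, set(), {"w"}, 3)
```

With `w` as a transient seed the activity moves through the network and dies out, whereas with `w` as a permanent seed the whole network becomes and stays active, which illustrates the difference between the two kinds of initial nodes.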

Without loss of generality, we can assume $A \cap \widetilde{A} = \emptyset$; otherwise we can use $A \setminus \widetilde{A}$ as the transient initial set
instead. Given a transient initial set A and a permanent initial set $\widetilde{A}$, we measure the influence of the agent by
the average number of active nodes over T time steps, where T is some pre-specified time scope in which the
process evolves. Observe that once the initial sets and the configuration of the thresholds are given, the active
sets $A_t$ are completely determined.

Definition 2 (Influence Function and Expected Influence). Given an information network G, a transient
initial set A, a permanent initial set $\widetilde{A}$, and a configuration $\theta$ of thresholds, the average influence over the time
period [1, T] is defined as $\sigma_\theta^{[1,T]}(A, \widetilde{A}) := \frac{1}{T} \sum_{t=1}^{T} |A_t|$. For simplicity, we omit the superscript [1, T] in $\sigma_\theta$ when
the target period is clear from the context. We define the expected influence $\sigma$ as the expectation of $\sigma_\theta(A, \widetilde{A})$
over the random choice of $\theta$, i.e., $\sigma(A, \widetilde{A}) := E_\theta[\sigma_\theta(A, \widetilde{A})]$.

Definition 3 (Influence Maximization Problem). In an information network G, suppose the advertising
cost of a transient initial node is c and that of a permanent initial node is $\widetilde{c}$, where the costs are uniform over
the nodes. Given a budget K, the goal is to find a transient initial set A and a permanent initial set $\widetilde{A}$ with
total cost $c \cdot |A| + \widetilde{c} \cdot |\widetilde{A}|$ at most K such that $\sigma(A, \widetilde{A})$ is maximized.

The most technical part of the paper is to show that the objective function $\sigma$ is submodular so that the
maximization techniques of Fisher et al. [15] can be applied.

Definition 4 (Submodular, Monotone). A function $f : 2^V \to \mathbb{R}$ is submodular if for any $A \subseteq B \subseteq V$ and
$w \in V \setminus B$, $f(B \cup \{w\}) - f(B) \le f(A \cup \{w\}) - f(A)$ holds. A function f is monotone if for any $A \subseteq B$,
$f(A) \le f(B)$. A function $g : 2^V \times 2^V \to \mathbb{R}$ is submodular (monotone) if, keeping one argument constant, the
function is submodular (monotone) as a function of the other argument.
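
On very small ground sets, both properties can be checked by brute force directly from the definition. The sketch below enumerates all pairs of nested subsets and is exponential in the size of the ground set, so it is only meant as a sanity check; the helper names and the two toy functions are hypothetical.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(f, ground, tol=1e-12):
    """Check f(B + w) - f(B) <= f(A + w) - f(A) for all A within B, w not in B."""
    ground = frozenset(ground)
    for big in subsets(ground):
        for small in subsets(big):
            for w in ground - big:
                gain_big = f(big | {w}) - f(big)
                gain_small = f(small | {w}) - f(small)
                if gain_big > gain_small + tol:
                    return False
    return True


def is_monotone(f, ground, tol=1e-12):
    """Check f(A) <= f(B) for all A within B."""
    ground = frozenset(ground)
    for big in subsets(ground):
        for small in subsets(big):
            if f(small) > f(big) + tol:
                return False
    return True


def capped(s):
    # Concave function of the cardinality: submodular and monotone.
    return min(len(s), 2)


def squared(s):
    # Convex function of the cardinality: monotone but not submodular.
    return len(s) ** 2


GROUND = {"x", "y", "z"}
```

For example, `is_submodular(capped, GROUND)` holds, while `squared` fails the test because its marginal gains grow with the size of the set.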

In order to facilitate the analysis of the influence process, we define indicator variables to consider the 
behavior of individual nodes at every time step. 

Definition 5 (Indicator Variable). In an information network G, given a transient initial set A and a
permanent initial set $\widetilde{A}$, a node v and a time t, let $X_v^t(A, \widetilde{A})$ be the indicator random variable that takes value
1 if node v is active at time t, and 0 otherwise. When $\widetilde{A} = \emptyset$, we sometimes write $X_v^t(A) := X_v^t(A, \emptyset)$.

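The indicator variable's usefulness is based on the following equality:

$$\sigma(A, \widetilde{A}) = E\left[\frac{1}{T} \sum_{t=1}^{T} |A_t|\right] = \frac{1}{T} \sum_{t=1}^{T} \sum_{v \in V} E\left[X_v^t(A, \widetilde{A})\right].$$

Hence, if the function $(A, \widetilde{A}) \mapsto E[X_v^t(A, \widetilde{A})]$ is submodular and monotone, then so is $\sigma$.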


3 Hardness of Maximization Problem 

We outline an NP-hardness proof for the maximization problem in our setting via a reduction from vertex cover
similar to that in [18]. We show that the problem is still NP-hard, even for the special case in which the
network is acyclic and each transient node and each permanent node has the same cost, which means that only permanent
nodes will be used.

Theorem 1. The influence maximization problem under the non-progressive linear threshold model is NP- 
hard even when the network is a directed acyclic graph and all the initially active nodes are permanent. 

Proof. Given an undirected graph with n vertices, we pick an arbitrary linear ordering of the nodes and direct
each edge accordingly to form a directed acyclic graph. We add a dummy node, and for each node with no outgoing
edges, we add an edge from that node to the dummy node. Hence, the network has n + 1 nodes. For each node, the
weights of its outgoing edges are distributed uniformly. The number T of time steps under consideration is 1.

We claim that there is a vertex cover of size k for the given graph iff there is a permanent initial
set $\widetilde{A}$ of size k + 1 such that $\sigma(\emptyset, \widetilde{A}) = n + 1$.

Suppose there is a vertex cover S of size k. Then, adding the dummy node to S to form the permanent
initial set $\widetilde{A}$, all nodes will be active in the next time step with probability 1, and so $\sigma(\emptyset, \widetilde{A}) = n + 1$. On the
other hand, if the permanent initial set $\widetilde{A}$ has size k + 1 and $\sigma(\emptyset, \widetilde{A}) = n + 1$, then the dummy node must be
in $\widetilde{A}$; let S be the set of non-dummy nodes in $\widetilde{A}$. If S does not form a vertex cover for the given graph,
then there exists some edge (u, v) such that both nodes u and v are inactive initially, and hence the probability
that u is active in the next time step is strictly smaller than 1. □
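
The construction in the proof can be sketched in a few lines. The code below is an illustrative sketch only: it assumes vertices numbered 0..n-1, orients each edge from the smaller to the larger index (one possible linear ordering), and checks whether every non-permanent node receives total weight 1 from the permanent set, which is exactly the condition for being active at time 1 with probability 1. The function names are hypothetical.

```python
def build_reduction(n, edges):
    """Build the directed acyclic network used in the reduction.

    Returns (b, dummy) where b[v][u] is the weight of edge (v, u).
    """
    dummy = "dummy"
    out = {v: [] for v in range(n)}
    for a, c in edges:
        lo, hi = min(a, c), max(a, c)
        out[lo].append(hi)                      # orient along the ordering
    b = {dummy: {}}                             # the dummy node has no out-edges
    for v in range(n):
        targets = out[v] or [dummy]             # sinks point to the dummy node
        b[v] = {u: 1.0 / len(targets) for u in targets}  # uniform weights
    return b, dummy


def all_active_after_one_step(b, permanent):
    """True iff every node is active at time 1 with probability 1."""
    for v in b:
        if v in permanent:
            continue
        f_v = sum(w for u, w in b[v].items() if u in permanent)
        if f_v < 1.0 - 1e-9:                    # some threshold would block v
            return False
    return True


# Path graph 0 - 1 - 2: {1} is a vertex cover, {0} is not.
B, DUMMY = build_reduction(3, [(0, 1), (1, 2)])
cover_works = all_active_after_one_step(B, {1, DUMMY})
non_cover_works = all_active_after_one_step(B, {0, DUMMY})
```

Here `cover_works` is true and `non_cover_works` is false, matching the two directions of the claim on this small instance.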

4 Information Network with Directed Cycles 

In this section, we show that when the information network has directed cycles, the expected influence
function is not submodular in general. Before describing the examples that violate submodularity, we
state a result that assists the argument.

Theorem 2. Given an information network G, if $A \mapsto E[X_v^t(A)]$ is submodular for every node $v \in V$ and
time t > 0, then the expected influence is submodular. On the other hand, if there exists a node $v^* \in V$ such
that

$$\frac{1}{T} \sum_{t=1}^{T} E[X_{v^*}^t(A)]$$

is not submodular, then by modifying G, we can construct an information network for which the expected
influence over the period [1, T] is not submodular.

Proof. The first statement follows by observing that

$$E\left[\frac{1}{T} \sum_{t=1}^{T} |A_t|\right] = \frac{1}{T} \sum_{t=1}^{T} \sum_{v \in V} E[X_v^t(A)].$$

Consider the second statement. We add a set L of nodes outside V to the graph, and for any node $v \in L$, we
include an edge $(v, v^*)$ with weight 1. Observe that this magnifies the effect of $v^*$ in the network. When |L| is
large enough, the new expected influence is not a submodular function. □

Next, we are ready to describe the counterexamples to submodularity.

Theorem 3 (Non-submodularity of Expected Influence). There exists an information network with directed
cycles, for which the expected influence under the NLT model is not submodular.

Proof. Based on Theorem 2, it is sufficient to show that there exists an information network that contains a
node $v^*$ such that the function $A \mapsto \frac{1}{T} \sum_{t=1}^{T} E[X_{v^*}^t(A)]$ is not submodular.

Consider the information network in Fig. 1(a). Each edge (v, u) is marked with its weight $b_{vu}$.



Figure 1: Counter examples for submodularity. 

Let S := {x}, T := {x, y}, and focus on node $v^*$. Then it is easy to check the following facts.

1. For any even time $t \ge 4$, $E[X_{v^*}^t(S \cup \{z\})] - E[X_{v^*}^t(S)] < E[X_{v^*}^t(T \cup \{z\})] - E[X_{v^*}^t(T)]$.

2. For any odd time $t \ge 5$, $E[X_{v^*}^t(S \cup \{z\})] - E[X_{v^*}^t(S)] = E[X_{v^*}^t(T \cup \{z\})] - E[X_{v^*}^t(T)]$.

This implies that the function $A \mapsto \sum_{t=1}^{T} E[X_{v^*}^t(A)]/T$ is not submodular when the number T of time steps is large enough. □

Theorem 4 (Networks with Self-loops). There exists an information network in which the only directed
cycle is a self-loop (on a non-void node), such that the corresponding expected influence under the NLT model is not
submodular.

Proof. Based on Theorem 2, it is sufficient to show that there exists an information network that contains a
node $v^*$ such that the function $A \mapsto \sum_{t=1}^{T} E[X_{v^*}^t(A)]/T$ is not submodular. Actually, we only need to find a
network in which there is a node $v^*$ and a particular time step t such that $A \mapsto E[X_{v^*}^t(A)]$ is not submodular.
Consider the information network in Fig. 1(b). Each edge (v, u) is marked with its weight $b_{vu}$. Let S := {x},
T := {x, w, z}, and focus on node $v^*$. Then it is easy to check that for time t = 2,

$$E[X_{v^*}^t(S \cup \{y\})] - E[X_{v^*}^t(S)] < E[X_{v^*}^t(T \cup \{y\})] - E[X_{v^*}^t(T)].$$

Thus, $A \mapsto E[X_{v^*}^t(A)]$ is not submodular. □

5 Acyclic Information Networks

In this section, we consider information networks without directed cycles. As we proved in Section 3, the
influence maximization problem under the NLT model is still NP-hard even when the underlying network has no
directed cycles.

Note that under the assumption of acyclic information networks, for any node $v \in V$ other than the void
node, the set $\Gamma(v)$ of its outgoing neighbors has no directed path back to v (see footnote 1). Intuitively, during the influence
procedure, v's choice of threshold $\theta_v$ can never affect the states of nodes in $\Gamma(v)$. To describe this fact formally,
we introduce a random object on $\Omega$, where $\Omega$ is the sample space, which is essentially the set of all possible
configurations of the thresholds.

Definition 6 (States of Nodes over Time). Suppose $\Omega$ is the sample space, and W is a subset of nodes.
Define a random object $\Pi_W : \Omega \to \{\mathcal{A}, \mathcal{N}\}^{|W| \times T}$ such that for $\omega \in \Omega$, $v \in W$ and $1 \le t \le T$, $\Pi_W(\omega)(v, t)$
indicates the state of node v at time t at the sample point $\omega$.

Lemma 1 (Independence). Suppose $v \in V$ and $\eta \in [0, 1]$, and let W be any subset of V with no directed
path to v. Then, we have $\Pr[\theta_v \le \eta \mid \Pi_W] = \eta$ under the NLT model.

Proof. In the NLT model, the randomness comes from the choices of thresholds $\theta = \{\theta_v\}_{v \in V}$. The sample space is
actually the set of all possible configurations of thresholds $\theta$.

Note that the event $\theta_v \le \eta$ is totally determined by the choice of $\theta_v$, and the value of $\Pi_W$ is totally
determined by the choices of all $\theta_u$'s such that $u \neq v$. From the description of the NLT model, the choices of
thresholds are independent over different nodes. This implies that, for any Q in the range of $\Pi_W$, the events
$\theta_v \le \eta$ and $\Pi_W = Q$ are independent.

Hence, we get $\Pr[\theta_v \le \eta \mid \Pi_W] = \Pr[\theta_v \le \eta] = \eta$. □

5.1 Connection to The Random Walk 

Consider time t > 0. The statuses of nodes at time t may be associated, since they can share some common
ancestors and the threshold of such an ancestor affects the statuses of its descendants. This association between
different nodes complicates the analysis. In order to assist the analysis and handle the association
carefully, we introduce a random walk process and show that this random walk process shares an interesting
connection with the NLT model. Next, we introduce the random walk process.

Model 2 (Random Walk Process (RW)). Consider an information network G = (V, E, b). For any given
node $v \in V$, we define a random walk process as follows.

• At time t = 0, the walk starts at node v.

• Suppose at some time t, the current node is u. A node $w \in \Gamma(u)$ is chosen with probability $b_{uw}$. The walk
moves to node w at time t + 1 (see footnote 2).

Definition 7 (Reaching Event). For any node v, subset $C \subseteq V$ and $t \ge 0$, we use $R_v^t(C)$ to denote the
event that a random walk starting from v would reach a node in C at precisely time t.
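
The probability of a reaching event can be computed exactly by pushing the distribution of the walk forward one step at a time. The sketch below assumes the same dictionary encoding as before, with the void node left implicit: any weight missing from a row is sent to a node named `VOID`, where the walk then stays. The function name and the example network are hypothetical.

```python
VOID = "void"


def reach_probability(b, start, targets, t):
    """Probability that the walk from `start` is at a node of `targets` at exactly time t."""
    dist = {start: 1.0}                          # distribution at time 0
    for _ in range(t):
        nxt = {}
        for u, p in dist.items():
            out = b.get(u, {})                   # the void node has no listed edges
            for w, weight in out.items():
                nxt[w] = nxt.get(w, 0.0) + p * weight
            leftover = 1.0 - sum(out.values())   # weight of the edge to the void node
            if leftover > 1e-12:
                nxt[VOID] = nxt.get(VOID, 0.0) + p * leftover
        dist = nxt
    return sum(p for node, p in dist.items() if node in targets)


# Hypothetical example: v -> u (0.6), v -> w (0.4), u -> w (1.0), w is a sink.
B = {"v": {"u": 0.6, "w": 0.4}, "u": {"w": 1.0}, "w": {}}

p1 = reach_probability(B, "v", {"w"}, 1)   # direct step v -> w
p2 = reach_probability(B, "v", {"w"}, 2)   # path v -> u -> w
p3 = reach_probability(B, "v", {"w"}, 3)   # the walk has already left w
```

Note that the event concerns the position at precisely time t, so a walk that visited `w` earlier and then moved on to the void node does not count.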

Next, we show the connection between the NLT model and the RW model for the case when the permanent 
initial set is empty. The more general case (arbitrary permanent initial set) will be considered later. 

Lemma 2 (Connection between NLT model and RW model). Consider an acyclic information network
G, and let v be a non-void node, and $1 \le t \le T$. On the same network G, consider the NLT process on a
transient initial set A and the RW process starting at v. Then, $E[X_v^t(A)] = \Pr[R_v^t(A)]$.

Proof. This lemma is the key point in our argument, and its proof is not obvious. To assist the proof
of Lemma 2, we next introduce a process called "Path Effect", which is an "equivalent" presentation of the
NLT model, and devote the remaining part of this subsection to this lemma. □
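
The identity in Lemma 2 can be checked numerically on a small acyclic network by comparing a Monte Carlo estimate of the activation probability with the exact random walk probability. The sketch below is only an empirical sanity check, not a proof; the diamond-shaped network, the sample size and the function names are hypothetical choices, and the void node is again left implicit.

```python
import random

VOID = "void"


def nlt_activation_estimate(b, transient, node, t, samples, rng):
    """Monte Carlo estimate of the probability that `node` is active at time t."""
    hits = 0
    for _ in range(samples):
        # One threshold per node, drawn once and kept for the whole run.
        # 1 - random() lies in (0, 1], which avoids a threshold of exactly 0.
        theta = {v: 1.0 - rng.random() for v in b}
        active = set(transient)
        for _ in range(t):
            active = {
                v for v in b
                if sum(w for u, w in b[v].items() if u in active) >= theta[v]
            }
        hits += node in active
    return hits / samples


def reach_probability(b, start, targets, t):
    """Exact probability that the walk from `start` is in `targets` at time t."""
    dist = {start: 1.0}
    for _ in range(t):
        nxt = {}
        for u, p in dist.items():
            out = b.get(u, {})
            for w, weight in out.items():
                nxt[w] = nxt.get(w, 0.0) + p * weight
            leftover = 1.0 - sum(out.values())   # mass moving to the void node
            if leftover > 1e-12:
                nxt[VOID] = nxt.get(VOID, 0.0) + p * leftover
        dist = nxt
    return sum(p for n, p in dist.items() if n in targets)


# Diamond: v -> a, c (0.5 each); a -> s (0.7); c -> s (0.4); s is a sink.
# The nodes a and c share the common ancestor s in the influence sense.
B = {"v": {"a": 0.5, "c": 0.5}, "a": {"s": 0.7}, "c": {"s": 0.4}, "s": {}}

estimate = nlt_activation_estimate(B, {"s"}, "v", 2, 20000, random.Random(0))
exact = reach_probability(B, "v", {"s"}, 2)      # 0.5 * 0.7 + 0.5 * 0.4
```

On this example the exact random walk value is 0.55, and the estimate should agree with it up to sampling error.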

We next introduce the PE process, which augments the NLT model and defines (random) auxiliary array
structures $P_v^t$, known as the influence paths, to record the influence history. Intuitively, if v becomes active at
time t, then the path $P_v^t$ shows which of the initially active nodes is responsible. An important invariant is that
node v is active at time t if and only if $P_v^t[0] \in A$.

1 Hence, the self-loop at the void node does not really interfere with the acyclic assumption. 

2 Observe that if w is the void node, then the walk remains at w.

Model 3 (Path Effect Process (PE)). Consider an information network G = (V, E, b). Each node $v \in V$
is associated with a threshold $\theta_v$, which is chosen from (0,1) uniformly at random.

Given a transient initial set $A \subseteq V$ and a configuration of the thresholds $\theta = \{\theta_v\}_{v \in V}$, for each node v and
each time step t, the influence paths $P_v^t$ are constructed in the following influence procedure.

• At time t = 0, for any $v \in V$, $P_v^0[0] = v$.

• At time t > 0, define the active set at the previous time step as $A_{t-1} = \{u \in V \mid P_u^{t-1}[0] \in A\}$. For each
node $v \in V$, we compute $f_v(A_{t-1}) := \sum_{u \in \Gamma(v) \cap A_{t-1}} b_{vu}$. Then,

a. If $f_v(A_{t-1}) \ge \theta_v$, choose node $u \in \Gamma(v) \cap A_{t-1}$ with probability $b_{vu} / f_v(A_{t-1})$.

b. If $f_v(A_{t-1}) < \theta_v$, choose node $u \in \Gamma(v) \setminus A_{t-1}$ with probability $b_{vu} / (1 - f_v(A_{t-1}))$.

Once u is chosen, let $P_v^t[0, \ldots, t-1] := P_u^{t-1}[0, \ldots, t-1]$ and $P_v^t[t] := u$.

Remark 1. Observe that, given an information network with a transient initial set A and a configuration $\theta$
of thresholds, the NLT model and the PE process produce exactly the same active set $A_t$ at each time
step t.

Definition 8 (Source Event). Consider an information network G on which the PE process is run on the
initial active set A. For any subset $C \subseteq V$, we use $I_v^t(C)$ to denote the event that $P_v^t[0]$ belongs to C. If $C \subseteq A$,
this event means that v's state at time t is the same as those of the nodes in A at time 0, and hence v is active. We
shall see later on that the event $I_v^t(C)$ is independent of A, and hence the notation has no dependence on A.

When the given network is acyclic, the PE process has an interesting property. 

Lemma 3 (Acyclicity Implies Independence of Choice). Consider an information network G = (V, E, b) on
which the PE process is run with some initial active set. If G is acyclic, for any non-void node $v \in V$ and node
$u \in \Gamma(v)$, we have $\Pr[P_v^t[t] = u \mid \Pi_W] = b_{vu}$, where $W = \Gamma(v)$. Recall that $\Pi_W$ carries the information about
the states of the nodes in W at every time step.

Proof. It is sufficient to prove that, for any value Q in the range of $\Pi_W$, $\Pr[P_v^t[t] = u \mid \Pi_W = Q] = b_{vu}$ holds.
Because $f_v(A_{t-1})$ is determined by the states of v's outgoing neighbors, once Q is fixed, $\eta(Q) = f_v(A_{t-1})$ is
determined. We consider two cases.

(1) When u is in $A_{t-1}$ according to Q. We have

$$\Pr[P_v^t[t] = u \mid (\theta_v \le \eta(Q)) \cap (\Pi_W = Q)] = \frac{b_{vu}}{\eta(Q)}.$$

Since $u \in A_{t-1}$ holds, the event $\{P_v^t[t] = u\}$ implies $\{\theta_v \le \eta(Q)\}$. Hence, we have

$$\Pr[P_v^t[t] = u \mid \Pi_W = Q] = \Pr[(P_v^t[t] = u) \cap (\theta_v \le \eta(Q)) \mid \Pi_W = Q]$$
$$= \Pr[P_v^t[t] = u \mid (\theta_v \le \eta(Q)) \cap (\Pi_W = Q)] \cdot \Pr[\theta_v \le \eta(Q) \mid \Pi_W = Q].$$

Because G is acyclic, W has no directed path to v. By Lemma 1, we have

$$\Pr[\theta_v \le \eta(Q) \mid \Pi_W = Q] = \eta(Q).$$

Hence, we have proved that $\Pr[P_v^t[t] = u \mid \Pi_W = Q] = b_{vu}$.

(2) When u is not in $A_{t-1}$ according to Q. The proof of this case is similar to the previous one. □

Recall the events $I_v^t(C)$ and $R_v^t(C)$ introduced in Definitions 8 and 7. The following lemma immediately
implies Lemma 2 with C = A, using the observation $E[X_v^t(A)] = \Pr[X_v^t(A) = 1] = \Pr[I_v^t(A)]$ from
Remark 1.

Lemma 4 (Connection between the PE process and the RW process). Suppose the information network
G is acyclic, and A is the transient initial set. For any $C \subseteq V$, any non-void node $v \in V$ and $t \ge 0$, we have
$\Pr[I_v^t(C)] = \Pr[R_v^t(C)]$. In particular, the probability $\Pr[I_v^t(C)]$ is independent of A.

3 Observe that if v is the void node, then $P_v^t[0, \ldots, t] = [\mathrm{void}, \ldots, \mathrm{void}]$.

Figure 2: The chain of dummy nodes


Proof. We use induction on t. For t = 0, we have $\Pr[I_v^0(C)] = 1$ iff $v \in C$, and $\Pr[R_v^0(C)] = 1$ iff $v \in C$. Hence,
$\Pr[I_v^0(C)] = \Pr[R_v^0(C)]$.

Suppose $\Pr[I_v^t(C)] = \Pr[R_v^t(C)]$ holds for all $v \in V$ at any time t < k.

We consider the case t = k and fix some non-void node v. Let $W := \Gamma(v)$. Recall that the random object
$\Pi_W$ carries information about the states of all of v's outgoing neighbors at every time step. For $u \in \Gamma(v)$, let $\mathcal{C}_u^{k-1}$ be the set
of the values for $\Pi_W$ under which $P_u^{k-1}[0] \in C$.

Observing that the events $\Pi_W = Q$ for different Q's are mutually exclusive, we have

$$\Pr[I_v^k(C)] = \sum_{u \in \Gamma(v)} \sum_{Q \in \mathcal{C}_u^{k-1}} \Pr[\Pi_W = Q \mid P_v^k[k] = u] \cdot \Pr[P_v^k[k] = u].$$

By Lemma 3, we have $\Pr[P_v^t[t] = u \mid \Pi_W] = b_{vu} = \Pr[P_v^t[t] = u]$, which implies that for any value Q of $\Pi_W$, it
holds that $\Pr[\Pi_W = Q \mid P_v^k[k] = u] = \Pr[\Pi_W = Q]$. Consequently, we have

$$\Pr[I_v^k(C)] = \sum_{u \in \Gamma(v)} \sum_{Q \in \mathcal{C}_u^{k-1}} \Pr[\Pi_W = Q] \cdot \Pr[P_v^k[k] = u]$$
$$= \sum_{u \in \Gamma(v)} \Pr[I_u^{k-1}(C)] \cdot \Pr[P_v^k[k] = u].$$

By the induction hypothesis $\Pr[I_u^{k-1}(C)] = \Pr[R_u^{k-1}(C)]$, we get

$$\Pr[I_v^k(C)] = \sum_{u \in \Gamma(v)} \Pr[R_u^{k-1}(C)] \cdot \Pr[P_v^k[k] = u]$$
$$= \sum_{u \in \Gamma(v)} \Pr[R_u^{k-1}(C)] \, b_{vu}.$$

The last term is just $\Pr[R_v^k(C)]$, according to the description of the RW process. This completes the inductive
step of the proof. □

In the next subsection, we reduce the general case with a non-empty permanent initial set to the case
with only a transient initial set. We can then prove the final conclusion (Theorem 6).

5.2 Submodularity of Acyclic NLT 

First, we consider the case where the permanent initial set $\widetilde{A}$ is non-empty. We show that this general case
can be reduced to the case where only the transient initial set A is non-empty, by the following transformation.
Suppose G is an information network with transient initial set A and permanent initial set $\widetilde{A}$, and T is the
number of time steps to be considered. Consider the following transformation on the network instance. For
each node $y \in \widetilde{A}$, do the following:

1. Add a chain $D_y$ of T dummy nodes to the network: starting from the head node of the chain, exactly one
edge with weight 1 points to the next node, and so on, until the end node is reached.

2. Remove all outgoing edges from y. Add exactly one outgoing edge with weight 1 from y to the head of
the chain $D_y$.

See Fig. 2 for an example of the chain of dummy nodes. Let $D := \bigcup_{y \in \widetilde{A}} D_y$ be the set of dummy nodes. We call
the new network $\overline{G}(\widetilde{A})$ the transformed network of G with respect to $\widetilde{A}$. When there is no risk of confusion,
we simply write $\overline{G}$. The transformed instance on $\overline{G}(\widetilde{A})$ only has $A \cup \widetilde{A} \cup D$ as the transient initial set and no
permanent initial node. The initially active dummy nodes in D ensure that every node $y \in \widetilde{A}$ is active for T
time steps. We use the notation convention that we add an overline to a variable (e.g., $\overline{X}$) if it is associated
with the transformed network.
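
The transformation can be sketched as follows, together with a check that, for one fixed configuration of thresholds, the transformed instance reproduces the original active sets on the non-dummy nodes. This is an illustrative sketch under stated assumptions: the dictionary encoding with an implicit void node, dummy nodes named by hypothetical tuples `("dummy", y, i)`, and a time scope of at least one step.

```python
def run_nlt(b, theta, transient, permanent, horizon):
    """Active sets [A_0, ..., A_T] of the NLT process for fixed thresholds."""
    permanent = set(permanent)
    active = set(transient) | permanent
    history = [set(active)]
    for _ in range(horizon):
        active = permanent | {
            v for v in b
            if sum(w for u, w in b[v].items() if u in active) >= theta[v]
        }
        history.append(set(active))
    return history


def transform(b, permanent, horizon):
    """Replace the out-edges of each permanent node by a chain of T dummy nodes."""
    b_bar = {v: dict(out) for v, out in b.items()}
    dummies = set()
    for y in permanent:
        chain = [("dummy", y, i) for i in range(1, horizon + 1)]
        b_bar[y] = {chain[0]: 1.0}               # y now listens only to the chain head
        for node, nxt in zip(chain, chain[1:]):
            b_bar[node] = {nxt: 1.0}             # each dummy listens to the next one
        b_bar[chain[-1]] = {}                    # the end node has no out-edges
        dummies.update(chain)
    return b_bar, dummies


B = {"v": {"u": 0.6, "w": 0.4}, "u": {"w": 1.0}, "w": {}}
THETA = {"v": 0.5, "u": 0.7, "w": 0.3}
T = 3
PERMANENT = {"w"}

B_BAR, DUMMIES = transform(B, PERMANENT, T)
theta_bar = {**THETA, **{d: 0.5 for d in DUMMIES}}   # any thresholds in (0, 1) work

original = run_nlt(B, THETA, set(), PERMANENT, T)
transformed = run_nlt(B_BAR, theta_bar, PERMANENT | DUMMIES, set(), T)
same = all((transformed[t] - DUMMIES) == original[t] for t in range(T + 1))
```

In the transformed run the dummy nodes switch off one by one from the end of the chain, which keeps `w` active through time T even though it is only a transient seed there.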

Remark 2. For any non-dummy, non-void node v,

$$X_v^t(A, \widetilde{A}) = \overline{X}_v^t(A \cup \widetilde{A} \cup D).$$

Lemma 5. Suppose we are given an instance on information network G, with transient initial set A and
permanent initial set $\widetilde{A}$. Let v be any non-void node in G and $0 < t \le T$. Suppose in the transformed network
$\overline{G}(\widetilde{A})$, for any subset C of nodes in $\overline{G}$, $\overline{R}_v^t(C)$ is the event that starting at v, the RW process on $\overline{G}$ for t steps
ends at a node in C. Then,

$$E[X_v^t(A, \widetilde{A})] = \sum_{u \in A} \Pr[\overline{R}_v^t(\{u\})] + \sum_{y \in \widetilde{A}} \sum_{i=0}^{t} \Pr[\overline{R}_v^{t-i}(\{y\})].$$

Proof. Let v be a non-void node in G; hence it cannot be a dummy node in $\overline{G}$. By Lemma 2, the equation
$E[X_v^t(A, \widetilde{A})] = E[\overline{X}_v^t(A \cup \widetilde{A} \cup D)]$ implies

$$E[X_v^t(A, \widetilde{A})] = \sum_{u \in A} \Pr[\overline{R}_v^t(\{u\})] + \sum_{y \in \widetilde{A}} \left( \Pr[\overline{R}_v^t(\{y\})] + \sum_{w \in D_y} \Pr[\overline{R}_v^t(\{w\})] \right).$$

Consider the RW process on $\overline{G}$ starting at v. For any node $y \in \widetilde{A}$, consider a node $w \in D_y$ that is i
hops away from y. If i > t, then it is impossible for v to reach w in t steps. Observe that if v reaches w at time
t, then v must reach y at time t - i. Hence, $\Pr[\overline{R}_v^t(\{w\})] = \Pr[\overline{R}_v^{t-i}(\{y\})]$, and the summation over i from 1
to t gives the required formula. □

Definition 9 (Passing-Through Event). Let $G$ be an information network. For any node $v$, subset $C \subseteq V$ 
and $t \ge 0$, we use $S_v^t(C)$ to denote the event that a RW process on $G$ starting from $v$ would reach a node in $C$ 
at time $t$ or before. 

Lemma 6 (General Connection between the NLT model and the RW model). Suppose $G$ is an acyclic 
information network, and let $v$ be a non-void node, and $1 \le t \le T$. On the same network $G$, consider the 
NLT model with transient initial set $A$ and permanent initial set $\widehat{A}$, and the RW process starting at $v$. Then, 
$E[X_v^t(A, \widehat{A})] = \Pr[R_v^t(A) \cup S_v^t(\widehat{A})]$. 

Proof. Without loss of generality, we can still assume $A \cap \widehat{A} = \emptyset$, because $X_v^t(A, \widehat{A}) = X_v^t(A \setminus \widehat{A}, \widehat{A})$ and 
$R_v^t(A) \cup S_v^t(\widehat{A}) = R_v^t(A \setminus \widehat{A}) \cup S_v^t(\widehat{A})$. From Lemma 5 we have 

$$E[X_v^t(A, \widehat{A})] = \sum_{u \in A} \Pr[\overline{R}_v^t(\{u\})] + \sum_{y \in \widehat{A}} \sum_{i=0}^{t} \Pr[\overline{R}_v^i(\{y\})],$$ 

where $\overline{R}$ denotes the reaching event in the transformed graph $\overline{G} = \overline{G}(\widehat{A})$. 

We compare the random walks of $t$ steps starting at $v$ on $G$ and on $\overline{G}$ using a coupling argument. Starting 
at $v$, the random walk on $\overline{G}$ copies the random choices made in $G$. This goes smoothly for the walk on $\overline{G}$ 
until a node $y$ in $\widehat{A}$ is hit, at which point further random choices made in $G$ are irrelevant. From this coupling 
argument, we can relate the events from $G$ and $\overline{G}$ in the following way: 

• For $u \in A$, $\Pr[\overline{R}_v^t(\{u\})] = \Pr[R_v^t(\{u\}) \setminus S_v^t(\widehat{A})]$. 

• For $y \in \widehat{A}$, $\sum_{i=0}^{t} \Pr[\overline{R}_v^i(\{y\})]$ is the probability that the walk in $G$ hits $y$ before any other node in $\widehat{A}$, at time $t$ or before. 

Hence, on the right-hand side of the equation above, the first term is $\Pr[R_v^t(A) \setminus S_v^t(\widehat{A})]$ and the second term 
is $\Pr[S_v^t(\widehat{A})]$. Therefore, their sum is $\Pr[R_v^t(A) \cup S_v^t(\widehat{A})]$, as required. □ 
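The probability on the right-hand side of Lemma 6 can be computed exactly by pushing the walk's distribution forward one step at a time. The sketch below is illustrative only and rests on assumptions not stated in this section: the network is a dictionary of out-neighbour weights, the walk moves from a node along an outgoing edge with probability equal to the edge weight, and any leftover probability mass is lost to the void node. The function name `prob_reach_or_pass` is hypothetical.

```python
def prob_reach_or_pass(graph, v, t, transient, permanent):
    """Probability that a t-step walk from v ends in `transient`
    or visits `permanent` at some time in 0..t.

    graph: dict node -> {out-neighbour: weight}; weights out of a node are
    assumed to sum to at most 1, the remainder being lost to the void node.
    """
    transient, permanent = set(transient), set(permanent)
    if v in permanent:
        return 1.0  # the passing-through event already holds at time 0
    dist = {v: 1.0}   # mass of walks that have not yet hit a permanent node
    passed = 0.0      # mass of walks that have hit a permanent node
    for _ in range(t):
        nxt = {}
        for x, mass in dist.items():
            for u, w in graph.get(x, {}).items():
                if u in permanent:
                    passed += mass * w   # absorbed: passing-through event
                else:
                    nxt[u] = nxt.get(u, 0.0) + mass * w
        dist = nxt
    # Walks that never passed through contribute only if they end in `transient`.
    reached = sum(m for u, m in dist.items() if u in transient)
    return passed + reached
```

Absorbing the walk at the first permanent node it meets mirrors the coupling argument: once such a node is hit, later random choices no longer affect the event.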
Theorem 5. (Submodularity and Monotonicity of $E[X_v^t(A, \widehat{A})]$). Consider the NLT model on an 
acyclic information network $G$ with transient initial set $A$ and permanent initial set $\widehat{A}$. Then, the function 
$(A, \widehat{A}) \mapsto E[X_v^t(A, \widehat{A})]$ is submodular and monotone. 

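Proof. For notational convenience, we drop the superscript $t$ and the subscript $v$, and write for instance 
$X(A, \widehat{A}) := X_v^t(A, \widehat{A})$. For the reaching and the passing-through events associated with the Random Walk 
Process in $G$, we write $R(A) := R_v^t(A)$ and $S(\widehat{A}) := S_v^t(\widehat{A})$. 

It is sufficient to prove that, for any $A \subseteq B \subseteq V$, $\widehat{A} \subseteq \widehat{B} \subseteq V$, and node $w \notin (B \cup \widehat{B})$, the following 
inequalities hold: 

$$E[X(A \cup \{w\}, \widehat{A})] - E[X(A, \widehat{A})] \ge E[X(B \cup \{w\}, \widehat{B})] - E[X(B, \widehat{B})]; \quad (1)$$ 

$$E[X(A, \widehat{A} \cup \{w\})] - E[X(A, \widehat{A})] \ge E[X(B, \widehat{B} \cup \{w\})] - E[X(B, \widehat{B})]. \quad (2)$$ 

By Lemma 6, for any subsets $C$ and $\widehat{C}$ such that $w \notin (C \cup \widehat{C})$, $E[X(C \cup \{w\}, \widehat{C})] - E[X(C, \widehat{C})] = \Pr[R(C \cup \{w\}) \cup S(\widehat{C})] - 
\Pr[R(C) \cup S(\widehat{C})] = \Pr[R(\{w\}) \setminus S(\widehat{C})]$, where the last equality follows from the definitions of the reaching and passing-through 
events. Hence, inequality (1) follows because $\widehat{A} \subseteq \widehat{B}$ implies that $R(\{w\}) \setminus S(\widehat{A}) \supseteq R(\{w\}) \setminus S(\widehat{B})$. 

Similarly, $E[X(C, \widehat{C} \cup \{w\})] - E[X(C, \widehat{C})] = \Pr[S(\{w\}) \setminus (R(C) \cup S(\widehat{C}))]$. Hence, inequality (2) follows 
because $R(A) \cup S(\widehat{A}) \subseteq R(B) \cup S(\widehat{B})$. □ 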
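The equality attributed to the definitions of the reaching and passing-through events can be unpacked as follows. This is a sketch that writes $\widehat{C}$ for the permanent set and uses only the fact that a walk occupies exactly one node at time $t$, so that $R(C \cup \{w\}) = R(C) \cup R(\{w\})$ with $R(\{w\}) \cap R(C) = \emptyset$ whenever $w \notin C$:

```latex
\begin{align*}
\Pr[R(C \cup \{w\}) \cup S(\widehat{C})] - \Pr[R(C) \cup S(\widehat{C})]
  &= \Pr\bigl[R(\{w\}) \setminus (R(C) \cup S(\widehat{C}))\bigr] \\
  &= \Pr\bigl[R(\{w\}) \setminus S(\widehat{C})\bigr] \;\ge\; 0 .
\end{align*}
```

Both marginal gains in the proof are probabilities of events and are therefore non-negative, which is what monotonicity requires.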

Corollary 1 (Objective Function is Submodular and Monotone). With the same hypothesis as in Theorem 5, 
the function $(A, \widehat{A}) \mapsto \sigma(A, \widehat{A})$ is submodular and monotone. 

Finally, we arrive at the main result of this paper. 

Theorem 6. Given an acyclic information network, a time period $[1, T]$, a budget $K$ and advertising costs 
(transient or permanent) that are uniform over the nodes, an advertiser can use the Standard Greedy Algorithm 
to compute a transient initial set $A$ and a permanent initial set $\widehat{A}$ with total cost at most $K$ in polynomial time 
such that $\sigma(A, \widehat{A})$ is at least $\frac{1}{2}$ of the optimal value. Moreover, there is a randomized algorithm that outputs $A$ 
and $\widehat{A}$ such that the expected value (over the randomness of the randomized algorithm) of $\sigma(A, \widehat{A})$ is at least 
$1 - \frac{1}{e}$ of the optimal value, where $e$ is the base of the natural logarithm. 

Proof. We describe how Theorem 6 is derived. Recall that the advertiser is given a budget $K$, the cost 
per transient node is $c$ and the cost per permanent node is $\widehat{c}$. Observe that if the advertiser uses $k$ transient 
nodes, where $k \le K/c$, then there can be at most $\widehat{k} := \lfloor (K - kc)/\widehat{c} \rfloor$ permanent nodes. Hence, for each such guess 
of $k$ and the corresponding $\widehat{k}$, the advertiser just needs to consider the maximization of the submodular and 
monotone function $(A, \widehat{A}) \mapsto \sigma(A, \widehat{A})$ on the matroid $\{(A, \widehat{A}) : |A| \le k, |\widehat{A}| \le \widehat{k}\}$, for which a $\frac{1}{2}$-approximation 
can be obtained in polynomial time using the techniques of Fisher et al. [12]. A randomized algorithm given by 
Calinescu et al. [3] achieves a $(1 - \frac{1}{e})$-approximation in expectation. □ 
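The enumeration over $k$ combined with a greedy selection can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a hypothetical oracle `sigma(A, A_hat)` that returns the objective value for a pair of sets (in practice this value would have to be computed or estimated), positive integer costs, and hypothetical function names `greedy_partition` and `best_over_budget_splits`.

```python
def greedy_partition(nodes, sigma, k, k_hat):
    """Greedy selection subject to |A| <= k and |A_hat| <= k_hat."""
    A, A_hat = set(), set()
    while True:
        base = sigma(A, A_hat)
        best, best_gain = None, 0.0
        for w in nodes:
            # Candidate: add w as a transient initial node.
            if len(A) < k and w not in A:
                gain = sigma(A | {w}, A_hat) - base
                if gain > best_gain:
                    best, best_gain = ("transient", w), gain
            # Candidate: add w as a permanent initial node.
            if len(A_hat) < k_hat and w not in A_hat:
                gain = sigma(A, A_hat | {w}) - base
                if gain > best_gain:
                    best, best_gain = ("permanent", w), gain
        if best is None:  # both sides full, or no strictly positive gain left
            return A, A_hat
        kind, w = best
        (A if kind == "transient" else A_hat).add(w)


def best_over_budget_splits(nodes, sigma, K, c, c_hat):
    """Try every number k of transient nodes and keep the best greedy result.

    K, c, c_hat are assumed to be positive integers here, so that the
    floor divisions below are exact.
    """
    best_sets, best_val = (set(), set()), sigma(set(), set())
    for k in range(K // c + 1):
        k_hat = (K - k * c) // c_hat  # permanent nodes affordable after k transient ones
        A, A_hat = greedy_partition(nodes, sigma, k, k_hat)
        val = sigma(A, A_hat)
        if val > best_val:
            best_sets, best_val = (A, A_hat), val
    return best_sets, best_val
```

Stopping as soon as no candidate has strictly positive gain does not lower the value returned for a monotone objective, since any further addition would contribute nothing.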